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Quaternions are an important tool to describe the orientation of a molecule. This paper considers the use of 
quaternions in matching two conformations of a molecule, in interpolating rotations, in performing statistics on 
orientational data, in the random sampling of rotations, and in establishing grids in orientation space. These 
examples show that many of the rotational problems that arise in molecular modeling may be handled simply and 
efficiently using quaternions. 
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I. INTRODUCTION 

Quaternions were introduced in the mid-nineteenth cen- 
tury by Hamilton (H 0] as an extension of complex numbers 
and as a tool for manipulating 3-dimensional vectors. Indeed 
Maxwell used them to introduce vectors in his exposition of 
electromagnetic theory |3, §§10-11]. However, unlike com- 
plex numbers which occupy a central role in the development 
of algebra, quaternions found no similar place in mathematics 
and, with the introduction of modern vector notation by Gibbs 
0|, quaternions fell out of favor by the end of the nineteenth 
century. Nevertheless, quaternions excel as a way of repre- 
senting rotations of objects in 3-dimensional space. They are 
economical to work with (both in terms of storage and com- 
putation); but more importantly they offer a clean conceptual 
framework which allow several problems involving rotations 
to be easily solved. 

Basic quaternion algebra is well covered in Hamilton's pa- 
pers 01I3> which are both accessible and readable. These pa- 
pers may be supplemented with a wealth of on-line resources 
|5|; |fj. Many authors over the past 20 years have "redis- 
covered" the application of quaternions to rotations and it is 
with some trepidation that this author inflicts another paper 
on the subject on the scientific community. However, within 
the molecular modeling community, quaternions are quite nar- 
rowly applied. This paper therefore briefly reviews quater- 
nion algebra and then describes their applications to a broad 
range of rotational problems in molecular modeling. Much of 
this material has appeared before — but often scattered about 
in journals for fields unrelated to molecular modeling. I have, 
therefore, endeavored to organize the material, to generalize 
it, and to present it with a consistent notation, with the hope 
this affords a deeper appreciation of the power of quaternions 
in describing rotations and encourages their wider adoption in 
molecular modeling. 

The outline of this paper is as follows. After introducing 



quaternions and their use in describing rotations, we tackle 
various applications. First we review the quaternion method 
for computing the least-squares fit of two conformations of 
the same molecule. We also see how to include molecular in- 
versions and discuss why the least-squares fit is a poor choice 
to describe the orientation of a flexible molecule. We next 
show how to interpolate smoothly between two orientations 
and that this corresponds to rotating the molecule at constant 
angular velocity. In order to carry out statistics on orienta- 
tional data, we give a robust definition of the mean orientation 
showing how to transform the deviations from the mean to 
3-dimensional space so that familiar statistical tools may be 
employed. In Monte Carlo applications, we need to be able 
to select a random orientation uniformly; we show that this is 
trivially accomplished in quaternion space and we also con- 
sider the problem of making random incremental rotations. 
Finally, it is frequently useful to impose a grid on orientation 
space and we illustrate how this may be done with applica- 
tions to quadrature and searching. 



II. QUATERNIONS 

The original notation for quaternions JlJ paralleled the con- 
vention for complex numbers 

q = q u + qx'l + q 2 } + q 3 k, 

which obey the conventional algebraic rule for addition and 
multiplication by scalars (real numbers) and which obey an 
associative non-commutative rule for multiplication where u 
is the identity element and 



i 2 =j 2 



U 



k 2 

jk = kj 



ik=j. 



It is frequently useful to regard quaternions as an ordered set 
of 4 real quantities which we write as 
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q = [qo, qi, 92, 93], 



(1) 
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or as a combination of a scalar and a vector 

q = [<Zo,q], (2) 

where q = [gi, q%, q 3 ] . A "scalar" quaternion has zero vector 
part and we shall write this as [qo,0] = qoii = qo. A "pure" 
quaternion has zero scalar part [0, q]. 

In the scalar- vector representation, multiplication becomes 

Pq = [Po9o - p • q, p q + 9oP + P x q], 

where "•" and "x" are the vector dot and cross products. The 
conjugate of a quaternion is given by 

q = [2o,-q]; 

the squared norm of a quaternion is 



|q| 2 =qq = ql + ql + 



and its inverse is 



q- X = q/|q| 



Quaternions with q = 1 are called unit quaternions, for 
which we have q _1 = q. 

The quaternion q can also be represented as a 2 x 2 complex 
matrix, 



9o ^ 

-?2 



iqi 

- iQ3 



92 

9o 



iqi 



or as a 4 x 4 real matrix, 



Q(q) 



qo qi 92 93 

-9i 9o -93 92 

-92 93 9o -9i 

-93 -92 9i 9o 



(3) 



in these forms, quaternion multiplication becomes matrix 
multiplication. 

The notation we adopt here is to use light-face italics for 
scalar quantities, bold roman for 3-dimensional vectors and 
3x3 matrices, bold sans serif for quaternions and 4x4 ma- 
trices. Quaternion multiplication is indicated by pq, while 
"•" is used to indicate matrix-vector and vector- vector (includ- 
ing quaternion-quaternion) contractions and in this context q 
and v are treated as column vectors. Thus, we may write 
|q| 2 = q T • q. We also find that pq = p T • Q(q), with Q given 
by eq. 0. Consistent with eqs. Q and (0, we shall number 
quaternion indices starting at and vector indices from 1 . 



III. ROTATIONS 

The chief application of quaternions to molecular modeling 
lies in their use to represent rotations. Consider a unit quater- 
nion 



[cos(0/2),vsin(0/2)], 



(4) 



where |v| = 1, and define an operator i? q on 3-dimensional 
vectors by 

[0,i? q (x)] =q[0,x]q. (5) 
Multiplying out the quaternion product, we find 

i? q (x)=R(q)-x, 
where R(q) is the tensor 

R(q) = (^-| q | 2 )I + 2qq+2< ?0 Ixq (6) 
= vv + cos 0(1 — vv) + sin 01 x v, (7) 

where aa is the parallel projector [(aa) • b = (a • b)a] and 
I x a is the cross operator [(I x a) • b = a x b] 0, §113]. 
Equation is the conventional tensor representation for a 
right-handed rotation of about an axis v through the origin 
126]. Equation (|6jl may be written in component form as 



R(q) = 

1 — 2g| — 2q% 2qiq 2 - 2q a q 3 2qiq 3 + 2q a q 2 

29291 + 2go93 1 — 2gf — 2q\ 2q 2 q 3 - 2q q 1 

2q 3 q 1 - 2q q 2 2q 3 q 2 + 2g 9i 1 - 2g 2 - 2q\ 



(8) 



The definition, eq. (0, gives i? p (i? q (x)) = i? pq (x), so that 
pq corresponds to composing rotations (with the rotation by q 
performed first). We also find that i? q = i?_ q ; i.e., q and — q 
give the same rotation — changing the sign of q is equivalent to 
increasing by 2n in eq. ®. Unit quaternions satisfy q$+q\ + 
q\ + q\ = 1 and the quaternion representation of rotations are 
as points on a hypersphere § 3 with opposite points identified. 
For future reference, we note that the (three-dimensional) area 
ofS 3 is 2tt 2 . 

Because q and — q give the same rotation, some care needs 
to be taken when comparing two orientations represented by 
q a and q&. The rotation, q = qbq^", moves from q a to q^. 
When inverting eq. © to determine the rotation angle be- 
tween the two orientations, we should, if necessary, change 
the sign of q to ensure that qo > 0, so that 6 [0,7r]. A 
simple metric for closeness is given by cos(0/2) = |qj • q& . 

Describing rotations with quaternions has a number of ben- 
efits. They offer a compact representation of rotations. Com- 
pared to Euler angles, they are free of singularities. Rotations 
may be composed more efficiently using quaternions than by 
matrix multiplication. Also in contrast to rotation matrices, it 
is easy to maintain a quaternion's unit normalization (merely 
divide it by |q|). However the chief benefit is that the repre- 
sentation of a rotation as point on S 3 allows us to derive many 
important results concerning rotations in a simple coordinate- 
free way. 

There is one application where the matrix representation of 
rotations is more efficient that the quaternion representation. 
If we wish to apply the same rotation to many points, then we 
should form the rotation matrix using eq. and transform 
the points by matrix multiplication. 

The conventional representation for rotations that is most 
closely allied to quaternions is the axis-angle representation, 
where the rotation is given by a vector s = 0v which denotes 
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a rotation of 9 = |s| about an axis v = s/ \s\. It is useful 
to have an analytic relation between the quaternion and axis- 
angle representations and this is provided by the quaternion 
exponential 1 1], 

exp([0,s/2]) = q, (9) 

where q is given by eq. @, This definition of the exponen- 
tial follows from its series expansion. Similarly the inverse 
operation is given by the quaternion logarithm 

lnq = [0,s/2 + 2ttov], (10) 

where n is an integer. 

It is useful here to make a distinction between "orientation" 
and "rotation". We imagine that our molecule has some ar- 
bitrary but definite reference state. We apply a rotation and 
a translation (jointly referred to as a "displacement") to this 
reference state and so bring the molecule to a new orientation 
and position (jointly referred to as a "configuration"). 

IV. LEAST-SQUARES FIT 

Given two conformations of the same molecule, it is often 
useful to be able to determine how close the conformations 
are. In order to do this, we can rigidly move one conforma- 
tion so that it nearly coincides up with the other and then de- 
termine the difference in the positions of the corresponding 
atoms. Thus, if we are given two sets of atomic positions 
{xfc} and {yk} with k e [1,N] together with a set of atomic 
"weights" {wk}, we wish to determine the (rigid) displace- 
ment T which minimizes 

where W = J^k w k- Here Wk is merely a statistical weight 
of an atom — it is not necessarily related to the atomic mass. 
The two sets of atomic positions are ordered which presumes 
that we can identify corresponding atoms. (This is not neces- 
sarily a simple matter, if, for example, we are dealing with 
a molecule with several identical branches.) The displace- 
ment T — (q, d) may be expressed as a rotation about an 
axis through the origin followed by a translation, i.e., T(x) = 
i? q (x)+d. 

This problem has been considered by many authors and a 
review of various approaches is given by Flower [7]. Using 
quaternions to describe the rotation leads to an elegant and ro- 
bust solution. An early use of quaternions in this context is 
to solve the problem formulated by Wahba |8; 9], the deter- 
mination of the attitude of a spacecraft given the directions of 
several objects relative to the craft. The resulting "q-method" 
is described by Keat [10, §A.3] and by Lerner O §12.2.3] 
who both credit the invention of the method to Paul B. Dav- 
enport (1968). The generalization to matching points (as op- 
posed to directions) was considered by Faugeras and Hebert 
111211 who independently found the same method for determin- 
ing the orientation. Their method was subsequently rediscov- 
ered by Horn Q, by Diamond Q, and by Kearsley fH . 



The derivation of Faugeras and Hebert is one of the clearest, 
and we briefly summarize it here including the straightforward 
generalization of including arbitrary weights w k - 

If we demand that the variation of E with respect to d van- 
ish, we find that 

d= (y) -i? q ((x)), (12) 
where (...) denotes the sample average, 

(*> = 4 (13) 

fc 

Equation (lilt may now be written as 

E = ^Y. w *\y'k- R ^k)\\ d4) 

fc 

where x' fe = Xfc - (x) and y' k = y k - (y). Using eq. (0, 
eq. (TBI , becomes 

^^EH^y'fcl-qMqf- (is) 

fc 

Because, the norm of a quaternion is unchanged on multiply- 
ing it by a unit quaternion, we may right-multiply the kernel 
of eq. (1151 by q to give 

e = ^Y, w ^y'k^- ^°yk}\ 2 - (i6) 

fc 

We need to minimize eq. (1161 subject to the constraint |q| = 1. 
Because the kernel is linear in q, it can be written as 

[0,yk]q-q[0,x' fe ] = A fc q, (17) 

where Afc is a 4 x 4 skew matrix 

Afc = A(yj^ +x' fc ,yj b -xj b ), 

with 

*c-)-(£rr.) 




-bi -b 2 -63 

61 -a 3 a 2 

b 2 a 3 — ai 

&3 — 02 a\ 



Substituting this into eq. (I16> . we obtain 

E = -^Y,w k q T ■flJ k -A k -q = q T -B-q, (18) 

where B = (Aj • Afc) is a 4 x 4 symmetric matrix which has 
real eigenvalues, < Ao < Ai < X2 < A3. Setting q to the 
eigenvector corresponding to Ao gives the minimum value for 

E = A . 

In summary, the best fit is achieved by subtracting the mean 
positions from the original sets of points to give {x' fc } and 
{y' k }, forming the matrices Afc and B, and determining the 
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minimum eigenvalue A of B. The optimal rotation is given 
by setting q to the corresponding eigenvector of B and the 
optimal translation is found from eq. (I12> . The mean squared 
error for this fit is Ao. 

This procedure has two attractive features. The rotation 
obtained is a proper rotation (without an inversion); this is 
usually the desired result. Secondly, degenerate molecules 
are treated satisfactorily. For example if one or both of the 
sets {xfc} and {y^} is collinear, then the best fit is no longer 
unique. The result will be that there will be multiple minimum 
eigenvalues of B with distinct eigenvectors. The general solu- 
tion is obtained by setting q to a linear combination of these 
eigenvectors. The method does require finding the eigenval- 
ues and eigenvectors of a 4 x 4 matrix. However there are 
many numerical libraries Jl6l[l7t Il8ll which solve such prob- 
lems and the results are accurate to round-off for small sym- 
metric matrices such as B. A fast method of determining just 
the required eigenvector and associated eigenvalue in order to 
determine the attitude of a spacecraft is given in fl9l §111]. 
However, in applications to molecular modeling, it is proba- 
bly preferable merely to invoke a library eigenvector routine. 

Horn fl3ll considered including a scaling in the transfor- 
mation T in eq. (lilt . This is quite easily accommodated. 
However there seems little need to include such an effect in 
molecular modeling. 

Diamond |20] considers the case where inversions are al- 
lowed. This is easily achieved by substituting — x' fc for x' fe in 
eq. (114-1 . Equation dl 81 then involves a matrix B' where 

B' = 2(|xf + |yf)l-B. (19) 

Consequently the rotation giving the best inverted fit is the 
eigenvector with the greatest eigenvalue of B, A3 . Because the 
sum of the eigenvalues of B is its trace, 4(|x'| 2 + |y'| 2 ), we can 
express the mean squared error for the inverted fit as ^(Ao + 
Ai + A2 — A3). Thus, once the eigenvalues of B have been 
computed we immediately determine whether the inverted fit 
will be better than the proper fit. 

Coutsias et al. |21] provide an interesting extension of this 
method. Suppose the atomic positions {x^.} represent a model 
of a molecule which depends on a set of parameters {a^}, 
for example, the torsion angles of a protein backbone. By 
considering the gradient of E in parameter space dE/dai, 
they provide a method for determining the parameter values 
which result in the best fit to a given crystal structure. 

One other interesting consequence of the result for the best 
fit is that the rotation is not a continuous function of the con- 
figurations of the molecules. Let us suppose that {x^} gives 
the position of the atoms in a molecule in some predefined 
configuration and suppose that {yk} gives the atom positions 
during the course of a dynamical simulation of the molecule. 
If the forces acting on the atoms are finite then y^ is a C 1 
function (twice differentiable). During the course of the de- 
formation of the molecule, B and its eigenvalues change. In 
the typical case, the two smallest eigenvalues exchange roles 
and q switches from one direction in M 4 to an orthogonal di- 
rection. This results in the orientation of the best fit changing 
discontinuously by 180°. 

In modeling a flexible molecule, it is frequently useful to 



separate the external degrees of freedom, namely position and 
orientation, from the internal degrees of freedom. This allows, 
for example, translational and rotational symmetry to the sys- 
tem to be enforced and correlations between the motions of 
atoms within a molecule to be studied. This begs the ques- 
tion of how best to define the position and orientation of a 
molecule. Taking the position to be the center of mass is often 
the obvious choice. The position (so defined) evolves accord- 
ing to Newton's second law driven by the total force on the 
molecule. It is not possible to keep track of the orientation in 
an analogous fashion by integrating the total angular momen- 
tum, because flexible bodies can change their orientation with 
zero angular momentum — witness the ability of a cat always 
to land on its feet. A possible definition of the orientation is 
the best fit orientation to a reference conformation; i.e., we 
define or(A) as the best fit orientation, expressed as a quater- 
nion, of the molecule in conformation A relative to a refer- 
ence conformation R. Here again this choice has the attractive 
feature that the whole molecule is included in the definition. 
There are two problems with this prescription. Firstly, the dif- 
ference in orientations between two conformations A and B 
depends, in general, on the choice of reference conformation, 
namely 

o R (B)o R (A)^o s (B)o s (A). 

(This is easily demonstrated for simple triatomic molecules.) 
Thus this definition of orientation entails a degree of "arbi- 
trariness" absent in our definition of position. A second more 
serious defect arises from the discussion in the previous para- 
graph. Recovering the actual configuration of the molecule 
from the orientation defined in this way is numerically un- 
stable (by a flip of 180°!) whenever the lowest eigenvalues 
cross. This would also lead to large and discontinuous appar- 
ent internal motions of the molecule with small changes in the 
atoms' true positions. A better choice would therefore be to 
make the fit to some rigid (or nearly rigid) subcomponent of 
the molecule l22ll . Although this still yields an arbitrary def- 
inition of orientation (depending on the choice of reference 
subcomponent), the resulting orientation varies continuously 
under continuous deformations of the molecule. An extensive 
discussion of how to separate the orientation from the inter- 
nal motions of a flexible molecule is given by Littlejohn and 
Reinsch |0. 

V. INTERPOLATING ROTATIONS 

The power of the quaternion representation of rotations is 
evident when we consider the problem of interpolating be- 
tween two orientations of a molecule. (This application might 
arise in the animation of a molecular simulation.) Suppose 
we wish to interpolate between q a and q^. Because these 
quaternions and their interpolants lie on the unit sphere § 3 , 
the shortest path will be a great circle whose parametric equa- 
tion is given by [24] 

= q B 8in(fl-fl+q»Bin(fl 
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where cos 8 = • qj,. In the computer animation commu- 
nity this "spherical linear interpolation" operation is denoted 
by Slerp(q a , qt,: u) — q(u9) 1 24] . As <f> is increased from 
to 2ir, q(<fi) becomes successively q a , q/,, — q a , —qb, and fi- 
nally returns to q a . During this operation the corresponding 
3-dimensional rotation has increased by 4ir. If qj ■ q& > 0, 
then < < 6> takes q(4>) smoothly from q a to q;,. If, on 
the other hand, qj ■ qb < 0, then a shorter path is found with 
> 4> > 9 — 7r which takes q(<fi) smoothly from q a to — qb. 

Equation i20\ is derived using simple geometrical argu- 
ments applied to S 3 and the same result is obtained for the 
great-circle interpolation for S™. For S 3 , the result can also be 
expressed as 

q(0) = (q b q7) 0/ V. 

This relation has the interpretation: rotate to q a and then rotate 
a fraction <f>/9 to the path from q a to q?,. The operation q" is 
defined by Ql 

q" = exp(u lnq). 

In fact this interpolation scheme results in the molecule un- 
dergoing rotation at constant angular velocity. In order to 
show this, consider a body rotating at uj about a unit axis 
v. The evolution of the orientation q satisfies the differential 
equation 

q = [0, (w/2)v] q. (21) 

This is easily solved (e.g., by using finite differences and pass- 
ing to the limit St — > 0) to give 

q(i) = exp([0,M/2)v])q(0) 

= [cos(wi/2),vsin(wt/2)]q(0), 

which agrees with eq. J20i with the substitutions (f> = ojt/2, 
q„ = q(0)and q h = [0, v] q(0). 

If we wish to interpolate between two configurations of a 
rigid molecule, we are free to specify a point, xo, in the ref- 
erence molecule which will move with constant velocity. If 
the initial and final configurations are given by T a = (q a , d a ) 
and Tb = (q^df,), with qj • q/, > 0, then the required in- 
terpolation is achieved by increasing u from to 1 with the 
orientation given by q(u0) and the translation given by 

(d a + i? qa (x ))(l -u) + (d b + R qb (x ))u - R q{ud) (x ). 

VI. MEAN ORIENTATION 

The mean of directional quantities has frequently presented 
difficulties [25]. Let us assume we have N samples of some 
directional quantity with weights Wk for k E and 
Sfe w k — W. In the case where the samples are angles (e.g., 
the dihedral angles of a molecular bond) or directions (e.g., 
the orientations of a diatomic molecule), there is a well estab- 
lished procedure |25, §2.2.1, §9.2.1]: express the directions 
as unit vectors in R 2 or R 3 , n^, and determine (n) where we 
take the sample average according to eq. Jl 3i . Now the mean 



direction is given by ((n)) = (n)/ |(n)|, while 1 — |(n)|, a 
quantity lying in [0, 1], is the "circular variance" 1251 §2.3.1] 
or "spherical variance" lEM §9.2.1]. Here (. . .) is defined as 
a simple weighted arithmetical average, eq. Jl 3i . while ((. . .)) 
denotes the physically relevant mean of a quantity. 

This procedure cannot be directly applied to unit quater- 
nions used to represent rotations because of the indistin- 
guishability of ±q. Instead, we view {qfc} as axes l25l 
§1.1, §9.1] in R 4 , and define ((q)) as the unit quaternion about 
which the weighted moment of inertia of {qfc} is minimum 
ll26l §3]. Thus we wish to minimize 

= -^E^«i» T -(i-qfcqI)-«q» 

k 

= <(q)) T .(l-(qq T )).<(q)). 

The minimum value of L is given by the minimum eigenvalue 
of I — (qq T ) and ((q)) is corresponding eigenvector. The re- 
sulting L, which is a quantity lying in [0, |], then provides a 
measure of the variance of the rotations. This definition of 
the mean has a number of desirable properties: it is invariant 
when the signs of the qfe are changed; it is independent of the 
order of the samples; and it transforms properly if the samples 
are transformed. 

This prescription can also be applied to determine the mean 
direction of objects whose symmetry makes n and — n in- 
distinguishable (for example, the orientation of the diatomic 
molecule N2). 

Suppose we wish to determine the mean configuration of a 
rigid molecule, i.e., the mean of {T^ = (q/., d/.)}. We are free 
to choose a point xo in the reference molecule whose position 
in the mean configuration coincides with its mean position. 
(Compare this with the discussion of interpolating configura- 
tions in the previous section.) A suitable definition for the 
mean configuration is then 

«T)) = («q», (d> + (i? q (x )) - i? ((q)) (x )). (22) 

Frequently, we need more precise information about the 
distribution of configurations than its variance. We might need 
to know how much the rotation about different axes are corre- 
lated or whether rotational and translational motions are cou- 
pled. It is also desirable to be able to fit model distributions to 
a set of samples. For these purposes, it is useful to be able to 
map rotations onto R 3 so that standard statistical tools can be 
employed. We require that the mapping be measure preserv- 
ing (constant Jacobian) to simplify the use of the transformed 
rotations. 

We have already introduced the axis-angle representation 
of rotations. We may make the restriction |s| < tt and so map 
the hemisphere qo > of S 3 onto a ball of radius tt in R 3 . 
Unfortunately, the mapping, eq. l|9}, does not have constant 
Jacobian. We can correct this by defining l27ll a new "turn" 
vector u with the properties 

u || s, (23a) 
H - (M^M)" 3 . (23b) 
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This is an extension of the Lambert azimuthal equal-area pro- 
jection providing a measure-preserving mapping of the hemi- 
sphere g > of S 3 onto the unit ball B 3 . Equation d23l 
is well behaved at the boundary, |u| = 1; however on this 
boundary antipodal points are identified. The inverse mapping 
has an infinite derivate at |u| = \phl for integer n^O which 
corresponds to shells in u space which map to the origin. This 
inverse of eq. A23I is easily implemented via Newton's method 
supplemented by a Taylor series at the origin and at \phi. 

This mapping was introduced |27] to allow distributions of 
orientations to be fit using a mixture of Gaussians [28]. Given 
a set of sample orientations {q^}, we compute the mean orien- 
tation, ((q)). The deviations of the samples from the mean are 
then given by the rotations {qfe((q))} and these are mapped to 
a set of turns {u^}. Because these are points in R 3 , we may 
fit them with a 3-dimensional Gaussian with zero mean and 
with covariance matrix (u T u). 

This procedure can be extended to fits of molecular config- 
urations. In this case the deviations from the mean configu- 
ration, eq. (I22> . is mapped into a point in R 6 ; the resulting 
Gaussian fit will capture the correlation between the transla- 
tional and rotational degrees of freedom. 

In closing this section, we mention an alternative way of 
fitting quaternion orientational data with analytic functions, 
namely in terms of spherical harmonics. The normal (3- 
dimensional) spherical harmonics can be generalized to 4 (and 
higher) dimensions 1 29; 30] and the orthogonality relation al- 
lows the coefficients of the harmonics to be computed sim- 
ply. The ±q symmetry merely results in the odd harmon- 
ics dropping out. However in typical molecular interactions, 
the relative orientation of the molecules is tightly constrained 
which means that a large number of spherical harmonics will 
be needed to represent the orientational distribution. For such 
applications, a representation in terms of localized functions, 
such as Gaussians, is preferable. 

VII. RANDOM ORIENTATION 

In Monte Carlo simulations 1 3JJ] it is sometimes necessary 
to select a molecule with a random and uniform position and 
orientation, for example, when attempting to insert a molecule 
into a simulation box during a grand canonical simulation 
ll32ll . Choosing a random position is straightforward. How- 
ever, we need to be careful to select the random orientation 
uniformly or else detailed balance will be violated (when bal- 
ancing insertions and deletions). One possibility is to choose 
a random turn u in B 3 and to convert this to a quaternion. 
However, it is much simpler to sample directly in quaternion 
space. 

Let us first establish the requirement for "uniform" sam- 
pling of orientations. Composing 3-dimensional rotations is 
carried out by the multiplication of unit quaternions; but we 
know that pq = p T • Q(q), where Q(q), given in eq. 0, is 
orthonormal if q is a unit quaternion. Thus 3-dimensional ro- 
tations map into a rigid rotation of § 3 ; a uniform density on § 3 
is invariant to such rotations. It follows that the task of sam- 
pling a random orientation reduces to picking a random unit 
quaternion uniformly on § 3 . 



Marsaglia 1 33] provides one prescription: select x\ and y\ 
uniformly in (—1, 1) until s\ = x\ + y\ < 1; similarly, select 
X2 and y2 uniformly in (—1, 1) until S2 = x\ + y\ < 1; then 

q = [x 1 ,y 1 ,x 2 \/(l - s 1 )/s 2 ,y2\/( 1 - s l)/ s 2] 

is uniformly distributed on S 3 . 

A more transparent and symmetric method (which general- 
izes to sampling points on §" |34, §7.1]) is to pick 4 normal 
deviates gi for i e [0, 4) and to set 

P = [.9o,ffi,52,ff3], q = p/|p|. 

Although this method is less efficient than Marsaglia's, the 
overall impact in the context of a molecular simulation is 
probably tiny. Both of these methods return points uniformly 
over the whole of § 3 rather that over just one hemisphere. In 
most applications, this is of no consequence. 

Other representations of rotation yield more complex rules 
for obtaining random orientations. For example, with Euler 
angles, we would sample uniformly the first and third angles 
and the cosine of the second angle. If the orientation is given 
in axis-angle space, s, then the axis, s/ \s\, should be chosen 
uniformly on § 2 , and the rotation angle, |s|, should be sampled 
from [0, 7r] with probability (2/n) sin 2 (|s| /2). Of course, this 
simplifies when s is transformed to u space, eq. ( 1231 . leading 
to a uniform distribution in B 3 . 

A related problem is selection of random rotational moves 
for use in a Monte Carlo simulation H3ltl . This method re- 
quires that detailed balance be satisfied, which, in the ab- 
sence of torque bias, means that the probability of selecting 
the new orientation is symmetric under interchange of old and 
new orientations. Because we are typically interested in small 
changes in orientation, it is most convenient to select the ro- 
tation in axis-angle space as exp([0,s]) and to set the new 
orientation 

q' = exp([0,s])q, 

where s is selected from an even distribution, p(s) = p(— s). 
(This result follows because the Jacobian factor is even in s.) 
Usually, we wish the choice of rotation axis to be isotropic, 
and in that case we have p(s) = p(\s\). Thus we might select 
s uniformly in a sphere of radius A. Rao et al. |35] select 
|s| uniformly in [0, A] (which results in a distribution which 
is singular at the origin in s space). An attractive choice of 
distribution is a 3-dimensional Gaussian 

^_ cxp(-l|s| 2 /A 2 ) 
P[S) ~ (2tt) 3 / 2 A 3 • 

Not only is this simple to sample from, but it allows torque 
bias to be included in a simple manner. Torque bias is imple- 
mented 1 35] by multiplying the a priori probability of select- 
ing a move by exp(A/3 t T ■ s), where (3 is the inverse tempera- 
ture, A is a constant (usually taken to be i), and t is the torque 
on the molecule. If the "starting" distribution is a Gaussian 
then the torque-bias factor merely shifts the Gaussian to give 

^_ cxp(-l|s~A/3A 2 t| 2 /A 2 ) 
P[S) ~ (2tt) 3 / 2 A 3 
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This offers two simplifications over the original procedure 
ll35ll : (a) it is trivial to sample from a shifted Gaussian; and 
(b) the acceptance probability, which involves the ratio of the 
forward and reverse a priori probabilities, is also easy to com- 
pute and, in particular, it does not require the evaluation of a 
normalization factor for the distribution. Similar considera- 
tions obviously apply to the application of force bias for trans- 
lational moves, as has been discussed by Rossky et al. |36]. 
Indeed, in the case of moving molecules, we would naturally 
perform a combined translational and orientational move ap- 
plying both force- and torque-bias simultaneously. There are 
often strong gradients in the forces in molecular simulations 
and a direct application of force bias in this case can lead to 
poor sampling because certain transitions are effectively dis- 
allowed. In such cases, it is prudent to limit the effect of the 
bias by limiting the shift in the Gaussian, if necessary, to en- 
sure that there a finite probability (at least 5-10%, say) of the 
sampled move being in the opposite direction to the force. 
This ensures that the molecule can effectively explore con- 
figuration space because small steps are always permitted and 
it provides a simpler "safety" mechanism than the distance 
scaling of A proposed by Mezei (3- 

Finally, some care needs to be taken to treat the possibility 
of the orientation "wrapping" around. Suppose the sampled s 
has |s| > 7T, then the resulting orientation is identical to the 
wrapped one, s — 2tts/ |s|. To ensure that detailed balance is 
maintained, the acceptance probability should use the a priori 
probability for the reverse move — s (rather than the negative 
of the wrapped move). A simple expedient for avoiding this 
problem is simply to reject any move with |s| > it. 

VIII. GRIDS FOR ORIENTATION 

In many contexts, it is important to be able to represent the 
independent variables for a problem on a grid. It is therefore 
useful to be able to map orientations onto a grid. Possible 
applications are binning molecular data, implementing cavity 
bias in orientation 1 38], performing systematic searching of 
orientations (where the goal is to provide more regular cov- 
erage of orientation space than is achieved by random sam- 
pling), and performing integrals over orientation by numeri- 
cal quadrature 1 39] . Our goal is to provide a simple rule for 
covering orientation space with a grid while ensuring that the 
grid elements are approximately of equal volumes and are not 
unduly distorted. Here again, representing the orientation as a 
quaternion provides a reasonable solution. 

Recall that unit quaternions lie on a hypersphere § 3 . Posi- 
tions on § 3 can be determined by 3 angle-like variables. How- 
ever these are a poor basis for a grid because of singularities 
in the resulting coordinate system. Instead let imagine sur- 
rounding § 3 by a tesseract (the 4-dimensional analogue of the 
cube) of edge length 2. This consists of 8 cells which are 
2x2x2 cubes tangent to S 3 . An exemplary cell is given by 
p with po = 1, |pi^o| < 1- We need only consider half of 
the cells of the tesseract because of the identification of ±p. 
Thus we choose to consider the four cells for which one of the 
components of p is +1. 

This then forms the basis for a cubical grid for orientation 



space. This is attractive because cubical grids are simple to 
index into; they are easy to refine; they have an metric fac- 
tor which is easy to compute; etc. The overall "wastefulness" 
of this grids relative to a cubic grid within a domain of M 3 is 
given by the ratio of the volume of four cells of the tesseract 
(4 x 2 3 ) to the area (really a volume) of a hemisphere of § 3 , 
i.e., 32/ir 2 sa 3.24. This might seem rather profligate. How- 
ever, if we managed to arrange the grid around the § 3 without 
any wastage, the grid edge would be reduced by a factor of 
only ^24 » 1.48. 

Let us divide each of the cells of the tesseract into M 3 grid 
cubes (of side 2/M). These cubes can then be projected to 
S 3 by scaling p to a unit quaternion. This operation scales 
the volume of each of the grid cubes by p 4 — a factor of 
|pp 3 is due to scaling a volume element linearly by |p| 1 and 
the last factor of |p| 1 arises from the distortion of the cube 
during this operation. The maximum scaling occurs at the 
corners of the tesseract, e.g., p = [1, 1, 1, 1], where p = 2, 
so that range of volumes for the grid elements is 16 with the 
maximum distortion being a factor of 2. Mapping between an 
arbitrary orientation q and a point in the grid is then achieved 
as follows. We identify the component qi of q which is largest 
in absolute value and set p = q/qi, giving pi = 1 and p^i £ 
[—1,1]. The grid then consists of 4 x M x M x M elements. 
The resolution of the grid, given by the maximum change in 
orientations between neighboring grid cells, is approximately 
4/M. (We need to multiply the grid cube edge by 2 to obtain 
the equivalent rotation angle, because, from eq. @, we have 
q = [l,v0/2] for small.) 

When the application is quadrature, it is natural to evaluate 
the function and to compute the metric factor |p| 4 at the cen- 
ters of the grid cubes. For binning, we assign the samples to 
the grid cube in the obvious way and again use the grid center 
to compute the metric factor to obtain a sample density. 

The cubical grid defined above is suitable for quadrature 
and searching where the cost of function evaluations is small. 
Sometimes, however, the cost of function evaluations is so 
high that it is desirable to find an "optimal" set of grid points. 
For integrations over S 2 , this is a well-studied problem 1 4-0] 
and various integration grid have been given that ensure ac- 
curacy to high order B41I1 . For § 3 , various spherical ^-designs 
are known lE^LE^ll . A i-design is a set of points on the sphere 
such that the average of a polynomial of degree t over the 
sphere is given by averaging the value of polynomial at those 
points. Unfortunately i-designs for 8 3 are only known for t 
up to 1 1 with the 1 1 -design corresponding to 60 orientations. 

In order to provide a denser coverage of the sphere we pro- 
pose the following strategy: Consider N sample orientations, 
corresponding to 2A^ points on § 3 . Define a "covering radius", 
a, as the maximum rotation needed to align an arbitrary ori- 
entation with one of the sample orientations. The "coverage", 
c, is defined by the ratio of the area of 2N spherical caps of 
rotational extent a to the total area of S 3 , i.e., 

c = Ar(Q ' ShlQ) (24) 

7T 

[compare with eq. (I23b )l. For a given N, the optimal config- 
uration of sample orientations is obtained by minimizing a — 
this gives the "thinnest" coverage, c. Finally, we weight each 
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TABLE I Coverings of orientation space. N is the number of orien- 
tations; a is the covering radius (expressed as a rotation), and c is the 
coverage, eq. <24t . 

N a c notes 

24 62.80°" 1.579 vertices of 2 24-cells, a 7-design 6 

60 44.48 oc 1.445 vertices of 600-cell, an 11 -design* 

360 27.78° 2.152 vertices and cells of 600-cell 1 ' 

50 69.66° 4.426 ZCW3.50 f 

538 32.53° 5.142 ZCW3.538 5 

6044 18.10° 10.051 ZCW3.6044 C 

oo 5.499/ tyN 8.821 7 cubic lattice in tesseract s 

oo 4.472/ v^/V 4.745 ,! body-centered cubic lattice in tesseract 8 

00 4.092/ ?/~N 3.635' cubic lattice in 48-cell' 

00 3.328/ \/N 1.956* body-centered cubic lattice in 48-celF 

648 20.83° 1.641 c48u27' (5 = 0.33582) 

7416 10.07° 2.133 c48u309 ; (5 = 0.15846) 

70728 4.71° 2.078 c48u2947' (5 = 0.07359) 

00 3.022/ \/N 1.464™ uniform body-centered cubic lattice" 

"cos- 1 (i(2v^-l)). 
6 See 0. 

''cos- 1 (i(3%/5-l)). 

rf 60 vertices with weight 1.32870 and 300 cell centers with weight 0.93426. 
c Euler angles for orientations taken from l39ll . 

'Ratio of maximum to minimum weights is 16. 

*20x/5/(3tt). 

f 224v%/[(17 + 12v / 2)tt]. 

^Ignoring boundary effects, ratio of maximum to minimum weights is 
64/(17 + 12V2) = 1.884. 
*280V5/[3(17 + 12V2)tt]. 

'Body-centered cubic lattice in a 48-cell with lattice spacing S; see l4dl . 
m 5\/57r/24. 

"Conjectured thinnest covering for N —* 00, based on optimal covering of 
M 3 Ej. 

sample point according to the fraction of orientational space 
which is closest to it (i.e., in proportion to the volumes of the 
Voronoi cells); and we set a secondary goal of minimizing 
the variation in the weights. We expect the resulting sample 
points and weights to provide robust and accurate estimates of 
orientational integrals — particularly of experimentally or nu- 
merically determined functions which are bounded but which 
may not have bounded derivatives. The sample points are also 
suitable for searching orientation space optimally. Finding 
such optimal sets of points is difficult in practice. So, here, we 
propose some sets based on the regular 4-dimensional poly- 
topes I44H45I1 . with the results summarized in tableU 

The 24-orientation set is obtained by placing two 24-cells 
(or icositetrachora) in their mutually dual configurations to 
give the set 

8 permutations of [±1,0,0,0], (25 a) 

1 6 permutations of r±i±|,±~,±~], (25b) 
24 permutations of [± -4- , ± , 0, 0] . (25c) 

(Each orientation is counted twice here because of the identi- 
fication of ±q.) The corresponding Voronoi tessellation is a 



truncated-cubic tetracontaoctachoron (or 48-cell) which con- 
sists of 48 regular truncated cubes 1 48] . The set of orienta- 
tions, eq. d25i . is the direct symmetry group for the cube. 

The vertices of the 600-cell (or hexacosichoron) ll45ll are 
given by eqs. i25k ) and ( 125b ) together with 

96 even permutations of [± ^±1 , ± ^pl , ± \ , 0] . 

In this case, the Voronoi tessellation is the dual of the 600- 
cell, namely the 120-cell (or hecatonicosachoron). Because 
the Voronoi cells are dodecahedra which are nearly spherical, 
the resulting 60 orientations gives a particularly thin covering 
of orientation space. A good covering is also provided by 
adding the centers of the tetrahedral cells of the 600-cell. 

For comparison, we list in table H] the data for some of the 
ZCW3 orientation sets used by Eden and Levitt |3Sj- These 
are obtained by the taking sets of points appropriate for inte- 
grating of a periodic unit cube 1 49: 15(1 15111 and mapping this 
set to the space of 3 Euler angles. There are two potential 
problems with this approach: (1) even though the metric of 
orientation space is treated properly, the mapping from Eu- 
ler angles to orientation space is not distance-preserving and 
we expect this to degrade the properties of a mesh; and (2) 
because one of the Euler angles is not periodic, functions in 
orientation space do not obey the constraints assumed in con- 
structing the sets of sample points. (More complete data for 
the ZCW3 sets is available in Efijl .l 

Finally, table |J]provides various strategies for constructing 
an arbitrarily fine grid. We start with gridding the tesseract on 
which we easily impose a cubical grid (see above). However, 
the optimal sphere covering of M. 3 is body-centered cubic 1 47], 
and such a grid results in a thinner covering. Still better cov- 
erings can be found by starting with the 48-cell which has a 
typical cell (a truncated cube), 

P0 = 1, |JV0| < V2 - 1, \pi\ + \ P 2\ + \p 3 \ < 1. (26) 

The other cells are obtained by multiplying p by the members 
of eq. i25\ . A cubic or body-centered cubic grid can easily be 
placed within each cell. For example, a body-centered lattice 
can be obtained with 

Po = 1, p = [k,l,m]6/2, 

subject to the constraint eq. ( I26K where k, I, and m are either 
all even integers or all odd integers. Table U gives three ex- 
amples of such grids. The disadvantage of grids in 48-cells is 
that care must be taken to treat the faces of the cell correctly. 
The triangular faces of the truncated cubes slice cut through 
the grid cells at an angle and the octagonal faces fit together 
with a 45° twist. It is therefore necessary to resort to numeri- 
cal methods to determine the volume of the Voronoi cells near 
the faces. The resulting data for the weights and examples of 
other body-centered cubic grids in the 48-cell with a > 0.65° 
are given in |46j. 

One special searching problem is determining the volume 
of the smallest rectangular box (whose edges are parallel to 
the coordinate axes) into which a given molecule fits. This 
problem arises in the study of a single protein bathed in a sol- 
vent. In order to eliminate boundary effects, it is possible to 
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construct a periodic system and, for efficiency, we wish the 
volume of the periodic cell to be minimum. We can solve this 
problem by systematically sampling over all orientations us- 
ing our grid. However, because of the symmetries of a cube, 
eq. (1251 . there are 24 equivalent orientations which minimize 
the volume and we can restrict the search to 1/24 of orienta- 
tion space by searching only in eq. i26\ . We should point out 
that for the purposes of mimicking a single solute molecule 
in a solvent with a periodic system, the "best" computational 
box is not given by fitting a single image of the solute into a 
box but rather by the more challenging problem of optimally 
fitting the solute molecules into its neighboring images |52|. 

The emphasis in the section is on covering all orientation 
space with a grid. In many molecular modeling applications, 
the orientation may be quite restricted, e.g., when consider- 
ing the orientation of a ligand in a protein binding pocket, 
and we may elect to restrict the integration (or search) to a 
set of orientations which differ from the mean rotation by 
at most O. If we express the deviation from the mean as 
a turn vector, eq. ( I23> . integrations may be carried out in 
(three-dimensional) turn space with the range of integration 
restricted to the ball |u| < y/ (Q — sin9)/7r. Because the 
mapping to turn space is volume preserving, the integrals are 
exact. In addition, provided that < tt/2, the mapping to 
turn space entails little distortion (< 12%) and standard nu- 
merical methods for integrating in a ball B 3 can be used. 

IX. DISCUSSION 

Quaternions are an ideal "fiducial" representation IH of 
orientation in a molecular simulation. They provide an eco- 
nomical format for program input and output and as the inter- 
nal representation of orientation. There is little redundancy in 
the representation — there is just the normalization constraint 
on its four elements and this is easily tested and corrected. 
At a given numerical precision, quaternions cover orientation 
space uniformly. Most operations involving orientation can be 
carried out directly and efficiently with quaternions and they 
can be converted to other representations as needed. The basic 
operation of composing rotations is most cheaply performed 
with quaternions. On the other hand, if we need to rotate a 
large molecule it is quicker to convert the quaternion to a ro- 
tation matrix, eq. (jHJl, and to perform matrix-vector multipli- 
cation than to apply eq. Q directly. 

In comparison, other representation suffer serious draw- 
backs. Rotations cannot be easily composed when expressed 
as Euler angles. Picking a random orientation is more awk- 
ward when rotation matrices are used. In neither of these rep- 
resentations is it easy to interpolate between two orientations 
or to compute the mean orientation. 

Although quaternions may be unfamiliar to some readers, 
we only needed to use quaternion algebra in the rule for com- 
posing rotations and in deriving the least squares fit. In car- 
rying out the other tasks, we just used the fact that rotations 
are represented by opposite points on § 3 and this provides a 
"natural" metric for rotations. In working with S 3 , we are able 
to carry over geometrical concepts from § 2 or use straightfor- 
ward extensions from Euclidean space, M 3 , to M 4 . 



A curious and non-obvious property of rotations which is 
evident from their representation on § 3 , with ±q identified, is 
that rotations do not form a simply connected group. Thus, if 
we rotate an object by 360° it returns to its original orientation 
but with the sign of q changed. This means that we cannot 
continuously deform the path that the object took to reduce it 
to a point. However, we can do this if we rotate the object by 
720°. This property of rotations is an immediate consequence 
of their representation as a pair of points ±q on § 3 and good 
visual illustrations of this property are provided by the Dirac 
belt trick l54ll and the Phillipine wine dance 15^1 . 
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